Abstract

Given a k-uniform hypergraph, the Ek-Vertex-Cover problem is to find the smallest subset
of vertices that intersects every hyperedge. We present a new multilayered PCP construction
that extends the Raz verifier. This enables us to prove that Ek-Vertex-Cover is NP-hard to
approximate within factor $(k - 1 - \varepsilon)$ for any $k \ge 3$ and any $\varepsilon > 0$. The result is essentially
tight as this problem can be easily approximated within factor k. Our construction makes use
of the biased Long-Code and is analyzed using combinatorial properties of s-wise t-intersecting
families of subsets.

1 Introduction

A k-uniform hypergraph H = (V, E) consists of a set of vertices V and a collection E of k-element
subsets of V called hyperedges. A vertex cover of H is a subset $S \subseteq V$ such that every hyperedge
in E intersects S, i.e., $e \cap S \neq \emptyset$ for each $e \in E$. An independent set in H is a subset whose
complement is a vertex cover, or in other words a subset of vertices that contains no hyperedge
entirely within it. The Ek-Vertex-Cover problem is the problem of finding a minimum size vertex
cover in a k-uniform hypergraph. This problem is alternatively called the minimum hitting set
problem with sets of size k (and is equivalent to the set cover problem where each element of the
universe occurs in exactly k sets).

The Ek-Vertex-Cover problem is a fundamental NP-hard optimization problem which arises
in numerous settings. For k = 2, it is just the famous vertex cover problem on graphs. Owing
to its NP-hardness, one is interested in how well it can be approximated in polynomial time. A
very simple algorithm that is invariably taught in a typical undergraduate algorithms class is the
following: greedily pick a maximal set of pairwise disjoint hyperedges and then include all vertices in
the chosen hyperedges in the vertex cover. It is easy to show that this gives a factor k approximation
algorithm for Ek-Vertex-Cover. State of the art techniques yield only a tiny improvement, achieving
a $k - o(1)$ approximation ratio. This raises the question whether achieving an approximation
factor of $k - \varepsilon$ for any constant $\varepsilon > 0$ could be NP-hard.
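The greedy algorithm described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper; the function name `matching_vertex_cover` and the toy instance are assumptions made for the example.

```python
from typing import FrozenSet, Iterable, Set


def matching_vertex_cover(hyperedges: Iterable[FrozenSet[int]]) -> Set[int]:
    """Return the vertices of a greedily built maximal set of disjoint hyperedges."""
    cover: Set[int] = set()
    for edge in hyperedges:
        # If the edge misses the cover, it is disjoint from every edge chosen
        # so far, so it joins the matching and all its vertices are added.
        if cover.isdisjoint(edge):
            cover.update(edge)
    return cover


# Toy 3-uniform instance (hypothetical data, not from the paper).
edges = [frozenset(e) for e in [(0, 1, 2), (2, 3, 4), (4, 5, 6), (0, 3, 6)]]
cover = matching_vertex_cover(edges)
print(sorted(cover))
```

On this instance the greedy cover uses the two disjoint hyperedges (0, 1, 2) and (4, 5, 6), so it has 6 vertices, while {2, 6} already hits every hyperedge. The ratio is exactly k = 3, which shows the factor k guarantee cannot be improved for this algorithm.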

In this paper, we prove a nearly tight hardness result for Ek-Vertex-Cover. Specifically, we
prove that Ek-Vertex-Cover is indeed NP-hard to approximate within factor $(k - 1 - \varepsilon)$ for any
$\varepsilon > 0$, thus explaining why no efficient algorithm with performance guarantee much better than k
has been found.

Previous Hardness Results 

The vertex-cover problem on hypergraphs where the size of the hyperedges is unbounded is nothing
but the Set-Cover problem. For this problem there is a $\ln n$ approximation algorithm [20, 18], and a
matching $(1 - o(1)) \ln n$ hardness result due to Feige [?]. The first explicit hardness result shown for
Ek-Vertex-Cover was due to Trevisan [23], who considered the approximability of bounded degree
instances of several combinatorial problems, and specifically showed an inapproximability factor of
$k^{1/19}$ for Ek-Vertex-Cover. Holmerin [?] showed that E4-Vertex-Cover is NP-hard to approximate
within $(2 - \varepsilon)$. Independently, Goldreich [?] showed a direct 'FGLSS'-type [5] reduction (involving
no use of the long-code, a crucial component in most recent PCP constructions) attaining a hardness
factor of $(2 - \varepsilon)$ for Ek-Vertex-Cover for some constant k. Later, Holmerin [17] showed that
Ek-Vertex-Cover is NP-hard to approximate within a factor of $k^{1-\varepsilon}$, and also that it is NP-hard to
approximate E3-Vertex-Cover within factor $(3/2 - \varepsilon)$.

Somewhat surprisingly, more recently Dinur, Guruswami and Khot gave a fairly simple proof
of an $\alpha \cdot k$ hardness result for Ek-Vertex-Cover (for some $\alpha > 1/3$). The proof takes a combinatorial
view of Holmerin's construction and instead of Fourier analysis uses some properties concerning
intersecting families of finite sets. The authors also give a more complicated reduction that shows
a factor $(k - 3 - \varepsilon)$ hardness for Ek-Vertex-Cover. The crucial impetus for that work came from the
recent result of Dinur and Safra [7] on the hardness of approximating vertex cover (on graphs), and
as in [7] the notion of biased long codes and some extremal combinatorics relating to intersecting
families of sets play an important role. In addition to ideas from [7], the factor $(k - 3 - \varepsilon)$
hardness result also exploits the notion of covering complexity introduced by Guruswami, Hastad
and Sudan. Both the $\alpha \cdot k$ and the $k - 3 - \varepsilon$ results have not been published (an ECCC
manuscript exists), since they have been subsumed by the work presented herein.

Our result and techniques 

In this paper we improve upon all the above hardness results by proving a factor $(k - 1 - \varepsilon)$
inapproximability result for Ek-Vertex-Cover. Already for k = 3, this is an improvement from
$1.5 - \varepsilon$ to $2 - \varepsilon$. Extending our result from $k - 1 - \varepsilon$ to $k - \varepsilon$ appears highly non-trivial and
in particular would imply a factor $2 - \varepsilon$ hardness for vertex-cover on graphs, a problem that is
notoriously difficult. While our proof shares some of the extremal combinatorics flavor of [7] and
[3], it draws its strength mainly from a new multilayered outer verifier system for NP languages.
This multilayered system is constructed using the Raz verifier [21] as a building block.

The Raz verifier, which serves as the starting point or "outer verifier" in most if not all recent
hardness results, can be described as follows. There are two sets of (non-Boolean) variables Y and
Z, and for certain pairs of $y \in Y$ and $z \in Z$, a constraint $\pi_{y \to z}$. The constraints are projections,
i.e., for each assignment to y there exists exactly one assignment to z such that the constraint $\pi_{y \to z}$
is satisfied. The goal is to find an assignment A to the variables so that a maximum number of
constraints are satisfied, i.e., have the property $\pi_{y \to z}(A(y)) = A(z)$. The PCP Theorem [2, 1]
along with the Parallel Repetition Theorem imply that for any $\varepsilon > 0$ it is NP-hard to distinguish
between the case where all the constraints can be satisfied and the case where no more than a
fraction $\varepsilon$ of the constraints can be satisfied.
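To make the projection property concrete, the following sketch stores each constraint as a lookup table from values of y to the unique consistent value of z, and measures the fraction of constraints an assignment satisfies. The variable names, the toy tables, and the function `satisfied_fraction` are hypothetical and serve only as an illustration.

```python
from typing import Dict, Tuple

# A projection maps each value of y to the single value of z that satisfies it.
Projection = Dict[int, int]


def satisfied_fraction(
    constraints: Dict[Tuple[str, str], Projection],
    a_y: Dict[str, int],
    a_z: Dict[str, int],
) -> float:
    """Fraction of constraints with pi(A(y)) == A(z)."""
    ok = sum(1 for (y, z), pi in constraints.items() if pi[a_y[y]] == a_z[z])
    return ok / len(constraints)


# Toy instance: two Y-variables with range {0,1,2,3}, one Z-variable with range {0,1}.
constraints = {
    ("y1", "z1"): {0: 0, 1: 0, 2: 1, 3: 1},
    ("y2", "z1"): {0: 1, 1: 0, 2: 1, 3: 0},
}
frac_all = satisfied_fraction(constraints, {"y1": 2, "y2": 0}, {"z1": 1})
frac_half = satisfied_fraction(constraints, {"y1": 0, "y2": 0}, {"z1": 1})
```

The first assignment satisfies both constraints, the second only one of them. The hardness statement says that telling "all satisfiable" from "at most an $\varepsilon$ fraction satisfiable" is NP-hard for suitable instances of this form.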

In the work of Dinur, Guruswami and Khot mentioned above, the $\alpha \cdot k$ hardness result is obtained by replacing every Y variable by a block of vertices
(representing its Long-Code). Hyperedges connect $y_1$-vertices to $y_2$-vertices only if there is some
$z \in Z$ such that $\pi_{y_1 \to z}, \pi_{y_2 \to z}$ are constraints in the system. This construction has an inherent
symmetry between blocks which deteriorates the projection property of the constraints, limiting
the hardness factor one can prove to at most k/2.

Another way of reducing the Raz verifier to Vertex-Cover is by maintaining the asymmetry 
between Y and Z, introducing a block of vertices for each variable in Y and in Z (representing their 
Long-Code). Each constraint $\pi_{y \to z}$ can be emulated by a set of hyperedges, where each hyperedge
consists of both y-vertices and z-vertices. The hyperedges can be chosen so that if the initial PCP
instance was satisfiable, then taking a certain 1/k fraction of the vertices in each block will be a
vertex-cover. However, this reduction has a basic 'bipartiteness' flaw: the underlying constraint graph,
being bipartite with parts Y and Z, has a vertex cover of size at most one half of the number of 
vertices. Taking all the vertices of, say, the Z variables will be a vertex cover for the hypergraph 
regardless of whether or not the initial PCP instance was satisfiable. This, once again, limits the 
gap to no more than k/2. 

We remark that this 'bipartiteness' flaw naturally arises in other settings as well. One example 
is approximate hypergraph coloring, where indeed our multilayered PCP construction has been 
successfully used for showing hardness, see |S1 IT9|. 

The Multilayered PCP. We overcome the k/2 limit by presenting a new, multilayered PCP. In
this construction we maintain the projection property of the constraints that is a strong feature
of the Raz verifier, while overcoming the 'bipartiteness' flaw. In the usual Raz verifier we have
two 'layers', the first containing the Y variables and the second containing the Z variables. In
the multilayered PCP, we have $l$ layers containing variables $X_1, X_2, \ldots, X_l$ respectively. Between
every pair of layers $i_1$ and $i_2$, we have a set of projection constraints that represent an instance of
the Raz verifier. In the multilayered PCP, it is NP-hard to distinguish between (i) the case where
there exists an assignment that satisfies all the constraints (between every pair of layers), and (ii)
the case where for every $i_1, i_2$ it is impossible to satisfy more than a fraction $\varepsilon$ of the constraints
between $X_{i_1}$ and $X_{i_2}$.

In addition, we prove that the underlying constraint graph no longer has the 'bipartiteness'
obstacle, i.e. it no longer has a small vertex cover and hence a large independent set. Indeed we
show that the multilayered PCP has a certain 'weak-density' property: for any set containing an $\varepsilon$
fraction of the variables there are many constraints between variables of the set. This guarantees
that "fake" independent sets in the hypergraph (i.e., independent sets that occur because there
are no constraints between the variables of the set) contain at most $\varepsilon$ of the vertices. Hence, the
minimum vertex cover must contain vertices in almost all of the blocks.

We mention that the PCP presented by Feige in [?] has a few structural similarities with ours.
Most notably, both have more than two types of variables. However, while in our construction the 
types are layered with decreasing domain sizes, in Feige's construction the different types are all 
symmetric. Furthermore, and more importantly, the constraints tested by the verifier in Feige's 
construction are not projections while this is a key feature of our multilayered PCP, crucially 
exploited in our analysis. 

We view the construction of the multilayered PCP as a central contribution of our paper, and
believe that it could be a powerful tool to reduce from in other hardness of approximation results as
well. In fact, as mentioned above, our multilayered construction has already been used in obtaining
strong hardness results for coloring 3-uniform hypergraphs [?] (namely the hardness of coloring
a 2-colorable 3-uniform hypergraph using an arbitrary constant number of colors), a problem for
which no non-trivial inapproximability results are known using other techniques. We anticipate
that this new outer verifier will also find other applications besides the ones in this paper and in
[?].

The Biased Long-Code 

Our hypergraph construction relies on the Long-Code that was introduced in [5], and more
specifically, on the biased Long-Code defined in [7]. Thus, each PCP variable is represented by a block
of vertices, one for each 'bit' of the biased Long-Code. More specifically, in x's block we have one
vertex for each subset of R, where R is the set of assignments for the variable x. However, rather
than taking all vertices in a block with equal weight, we attach weights to the vertices according
to the p-biased Long-Code. The weight of a subset F is set to $p^{|F|}(1-p)^{|R \setminus F|}$, highlighting subsets
of cardinality $p \cdot |R|$. Thus we actually construct a weighted hypergraph which can then be easily
translated, by appropriate duplication of vertices, to a non-weighted one (see, e.g., [?]).

The vertex cover in the hypergraph is shown to have relative size of either $1 - p$ in the good case
or almost 1 in the bad case. Choosing large $p = 1 - \frac{1}{k-1-\varepsilon}$ yields the desired gap of $\frac{1}{1-p} \approx k - 1 - \varepsilon$
between the good and bad cases. The reduction uses the following property: a family of subsets of
a set R, where each subset has size $p|R|$, either contains very few subsets, or it contains some $k - 1$
subsets whose common intersection is very small. We will later show that this property holds for
$p < 1 - \frac{1}{k-1}$ and therefore we obtain a gap of $k - 1 - \varepsilon$. As can be seen, this property does not
hold for $p > 1 - \frac{1}{k-1}$ and therefore one cannot improve the $k - 1 - \varepsilon$ result by simply increasing p.
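The arithmetic behind the choice of bias can be checked numerically. The sketch below assumes the bias $p = 1 - 1/(k-1-\varepsilon)$ and the good-case cover weight $1-p$ stated above; the helper names `bias_for` and `gap` are made up for the illustration.

```python
def bias_for(k: int, eps: float) -> float:
    """Bias p = 1 - 1/(k - 1 - eps) used for the p-biased Long-Code."""
    return 1.0 - 1.0 / (k - 1 - eps)


def gap(k: int, eps: float) -> float:
    """Ratio between a cover of weight (almost) 1 and a cover of weight 1 - p."""
    return 1.0 / (1.0 - bias_for(k, eps))


for k in (3, 4, 5):
    p = bias_for(k, 0.1)
    # The intersecting-families property needs p strictly below 1 - 1/(k-1).
    print(k, round(p, 4), round(gap(k, 0.1), 4), p < 1 - 1 / (k - 1))
```

For every $\varepsilon > 0$ the chosen p stays strictly below the threshold $1 - 1/(k-1)$, which is exactly the regime in which the combinatorial property is claimed to hold.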

Location of the gap 

All our hardness results have the gap between sizes of the vertex cover at the "strongest" location.
Specifically, to prove a factor $(k - 1 - \varepsilon)$ hardness we show that it is hard to distinguish between
k-uniform hypergraphs that have a vertex cover of weight $\frac{1}{k-1} + \varepsilon$ from those whose minimum vertex
cover has weight at least $(1 - \varepsilon)$. This result is stronger than a gap of about $(k - 1)$ achieved, for
example, between vertex covers of weight $\frac{1}{(k-1)^2}$ and $\frac{1}{k-1}$. In fact, by adding dummy vertices, our
result implies that for any $c < 1$ it is NP-hard to distinguish between hypergraphs whose minimum
vertex-cover has weight at least c from those which have a vertex-cover of weight at most $(\frac{c}{k-1} + \varepsilon)$.
Put another way, our result shows that for k-uniform hypergraphs, for $k \ge 3$, there is a fixed $\alpha$ such
that for arbitrarily small $\varepsilon > 0$, it is NP-hard to find an independent set consisting of a fraction
$\varepsilon$ of the vertices even if the hypergraph is promised to contain an independent set comprising a
fraction $\alpha$ of vertices. We remark that such a result is not known for graphs and seems out of reach
of current techniques. (The recent 1.36 hardness result for vertex cover on graphs due to Dinur and
Safra [7], for example, shows that it is NP-hard to distinguish between cases when the graph has
an independent set of size $0.38 \cdot n$ and when no independent set has more than $0.16 \cdot n$ vertices.)

Organization 

We begin in Section 2 by developing the machinery from extremal combinatorics concerning
intersecting families of sets that will play a crucial role in our proof. In Section 3 we present the
multilayered PCP construction. In Section 4 we present our reduction to a gap version of
Ek-Vertex-Cover which allows us to prove a factor $(k - 1 - \varepsilon)$ inapproximability result for this problem.

2 Intersecting Families 

In this section we describe certain properties of s-wise t-intersecting families. For a comprehensive
survey, see [13]. Denote $[n] = \{0, 1, \ldots, n-1\}$ and $2^{[n]} = \{F \mid F \subseteq [n]\}$.

Definition 2.1 A family $\mathcal{F} \subseteq 2^{[n]}$ is called s-wise t-intersecting if for every s sets $F_1, \ldots, F_s \in \mathcal{F}$,
we have

$$|F_1 \cap \cdots \cap F_s| \ge t.$$

We are interested in bounding the size of such families, and for this purpose it is useful to
introduce the notion of a left-shifted family. Performing an (i, j)-shift on a family consists of
replacing the element j with the element i in all sets $F \in \mathcal{F}$ such that $j \in F$, $i \notin F$ and
$(F \setminus \{j\}) \cup \{i\} \notin \mathcal{F}$. A left-shifted family is a family which is invariant with respect to (i, j)-shifts for any
$1 \le i < j \le n$. For any family $\mathcal{F}$, by iterating the (i, j)-shift for all $1 \le i < j \le n$ we eventually
get a left-shifted family which we denote by $S(\mathcal{F})$. The following simple lemma summarizes the
properties of the left-shift operation (see, e.g., [13], p. 1298, Lemma 4.2):

Lemma 2.2 For any family $\mathcal{F} \subseteq 2^{[n]}$, there exists a one-to-one and onto mapping $\tau$ from $\mathcal{F}$ to
$S(\mathcal{F})$ such that $|F| = |\tau(F)|$ for every $F \in \mathcal{F}$. In other words, left-shifting a family maintains its
size and the size of the sets in the family. Moreover, if $\mathcal{F}$ is an s-wise t-intersecting family then so
is $S(\mathcal{F})$. ■
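The shift operation and the invariants of Lemma 2.2 can be tried out on a small family. The sketch below is an illustration under the assumption of 0-based ground elements $\{0, \ldots, n-1\}$; the function names and the example family are hypothetical.

```python
from itertools import combinations, combinations_with_replacement
from typing import FrozenSet, Set

Family = Set[FrozenSet[int]]


def shift(family: Family, i: int, j: int) -> Family:
    """One (i, j)-shift: replace j by i in a set when the result is not already present."""
    out: Family = set()
    for f in family:
        g = (f - {j}) | {i}
        if j in f and i not in f and g not in family:
            out.add(frozenset(g))
        else:
            out.add(f)
    return out


def left_shift(family: Family, n: int) -> Family:
    """Iterate (i, j)-shifts with i < j until the family no longer changes."""
    changed = True
    while changed:
        changed = False
        for i, j in combinations(range(n), 2):
            shifted = shift(family, i, j)
            if shifted != family:
                family, changed = shifted, True
    return family


def is_s_wise_t_intersecting(family: Family, s: int, t: int) -> bool:
    """Check |F_1 ∩ ... ∩ F_s| >= t for every choice of s sets (repeats allowed)."""
    return all(
        len(frozenset.intersection(*fs)) >= t
        for fs in combinations_with_replacement(family, s)
    )


family = {frozenset(f) for f in [(1, 2, 3), (2, 3, 4), (1, 3, 4)]}
shifted = left_shift(family, 5)
```

Each effective shift lowers the sum of all elements in the family, so the loop terminates. On the example the result is {0,1,2}, {0,1,3}, {0,2,3}: the number of sets, their sizes, and the 2-wise 2-intersecting property are all preserved, as the lemma states.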

The next lemma states that a subset F in a left-shifted s-wise t-intersecting family cannot be
'sparse' on all of its prefixes $F \cap [t + js]$, $\forall j \ge 0$.

Lemma 2.3 ([13], p. 1311, Lemma 8.3) Let $\mathcal{F}$ be a left-shifted s-wise t-intersecting family.
Then, for every $F \in \mathcal{F}$, there exists a $j \ge 0$ with $|F \cap [t + sj]| \ge t + (s - 1)j$. ■

Definition 2.4 For a bias parameter $0 < p < 1$, and a ground set R, the weight of a set $F \subseteq R$ is

$$\mu_p^R(F) = p^{|F|} \cdot (1-p)^{|R \setminus F|}.$$

When R is clear from the context we write $\mu_p$ for $\mu_p^R$. The weight of a family $\mathcal{F} \subseteq 2^R$ is
$\mu_p(\mathcal{F}) = \sum_{F \in \mathcal{F}} \mu_p(F)$.

The weight of a subset is precisely the probability of obtaining this subset when one picks every
element in R independently with probability p.
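A direct computation illustrates the p-biased weight as a probability distribution. The following sketch uses hypothetical helper names (`subsets`, `mu`, `family_weight`) and a small ground set chosen for the example.

```python
from itertools import combinations


def subsets(ground):
    """All subsets of the ground set, as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def mu(p, ground, f):
    """p-biased weight of a single subset f of the ground set."""
    return p ** len(f) * (1 - p) ** (len(ground) - len(f))


def family_weight(p, ground, family):
    """Weight of a family: the sum of the weights of its members."""
    return sum(mu(p, ground, f) for f in family)


R = {0, 1, 2, 3}
p = 0.3
total = family_weight(p, R, subsets(R))
containing_zero = family_weight(p, R, [f for f in subsets(R) if 0 in f])
```

The weights of all subsets sum to 1, and the family of subsets containing a fixed element has weight p, since that element is picked with probability p independently of the others.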

The following is the main lemma of this section. It shows that for any $p < \frac{s-1}{s}$, a family of
non-negligible $\mu_p$-weight (i.e., $\mu_p(\mathcal{F}) \ge \varepsilon$) cannot be s-wise t-intersecting for sufficiently large t.

Lemma 2.5 For any $\varepsilon, s, p$ with $p < \frac{s-1}{s}$, there exists a $t = t(\varepsilon, s, p)$ such that for any s-wise
t-intersecting family $\mathcal{F} \subseteq 2^{[n]}$, $\mu_p(\mathcal{F}) < \varepsilon$.

Proof: The proof follows from Lemma 2.3 (see [13], p. 1311, Theorem 8.4). Let $\mathcal{F}$ be an s-wise
t-intersecting family where t will be determined later. According to Lemma 2.2, $S(\mathcal{F})$ is also s-wise
t-intersecting and $\mu_p(S(\mathcal{F})) = \mu_p(\mathcal{F})$. By Lemma 2.3, for every $F \in S(\mathcal{F})$, there exists a $j \ge 0$ such
that $|F \cap [t + sj]| \ge t + (s - 1)j$. We can therefore bound $\mu_p(S(\mathcal{F}))$ from above by the probability
that such a j exists for a random set chosen according to the distribution $\mu_p$. We now prove an
upper bound on this probability which will give the desired bound on $\mu_p(S(\mathcal{F}))$ and hence also on
$\mu_p(\mathcal{F})$.

Let $\delta = \frac{s-1}{s} - p$. Then, for any $j \ge 0$, $\Pr[\,|F \cap [t + sj]| \ge t + (s - 1)j\,]$ is at most

$$\Pr[\,|F \cap [t + sj]| - p(t + sj) \ge \delta(t + sj)\,] \le e^{-2(t+sj)\delta^2}$$

by the Chernoff bound [?]. Summing over all $j \ge 0$ we get:

$$\mu_p(S(\mathcal{F})) \le \sum_{j \ge 0} e^{-2(t+sj)\delta^2} = e^{-2t\delta^2} / (1 - e^{-2s\delta^2})$$

which is smaller than $\varepsilon$ for large enough t. ■
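The closed-form bound from the proof can be evaluated to see how large t must be in a concrete case. This is a numeric sketch only; `tail_bound` and `smallest_t` are hypothetical helper names, and the parameter values are chosen purely for illustration.

```python
import math


def tail_bound(t: int, s: int, p: float) -> float:
    """Geometric-series bound e^{-2 t d^2} / (1 - e^{-2 s d^2}) with d = (s-1)/s - p."""
    delta = (s - 1) / s - p
    return math.exp(-2 * t * delta**2) / (1 - math.exp(-2 * s * delta**2))


def smallest_t(eps: float, s: int, p: float) -> int:
    """Smallest t for which the bound drops below eps (requires p < (s-1)/s)."""
    t = 0
    while tail_bound(t, s, p) >= eps:
        t += 1
    return t


t_star = smallest_t(0.01, 2, 0.4)
# Cross-check the closed form against a long partial sum of the series.
partial = sum(math.exp(-2 * (t_star + 2 * j) * 0.1**2) for j in range(5000))
```

The bound decays exponentially in t, so a suitable t always exists when $\delta > 0$, although it grows quickly as p approaches $(s-1)/s$.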

3 The Multilayered PCP 

3.1 Starting Point - The PCP Theorem and the Parallel Repetition Theorem 

As is the case with many inapproximability results (e.g., [3], [?], [?], [22]), we begin our reduction
from the Raz verifier described next. Let $\Psi$ be a collection of two-variable constraints, where the
variables are of two types, denoted Y and Z. Let $R_Y$ denote the range of the Y-variables and
$R_Z$ the range of the Z-variables, where $|R_Z| \le |R_Y|$.¹ Assume each constraint $\pi \in \Psi$ depends on
exactly one $y \in Y$ and one $z \in Z$; furthermore, for every value $a_y \in R_Y$ assigned to y there is
exactly one value $a_z \in R_Z$ to z such that the constraint $\pi$ is satisfied. Therefore, we can write each
constraint $\pi \in \Psi$ as a function from $R_Y$ to $R_Z$, and use notation $\pi_{y \to z} : R_Y \to R_Z$. Furthermore,
we assume that the underlying constraint graph is bi-regular, i.e., every Y-variable appears in the
same number of constraints in $\Psi$, and every Z-variable appears in the same number of constraints
in $\Psi$.

The following theorem follows by combining the PCP Theorem with Raz's Parallel Repetition
Theorem. The PCP given by this theorem will be called the Raz verifier henceforth.

1 Readers familiar with the Raz verifier may prefer to think concretely of $R_Y = [7^u]$ and $R_Z = [2^u]$ for some
number u of repetitions.

Theorem 3.1 (PCP Theorem [?] + Raz's Parallel Repetition Theorem [?]) Let $\Psi$ be as
above. There exists a universal constant $\gamma > 0$ such that for every (large enough) constant $|R_Y|$ it
is NP-hard to distinguish between the following two cases:

• Yes : There is an assignment $A : Y \to R_Y$, $A : Z \to R_Z$ such that all $\pi \in \Psi$ are satisfied
by A, i.e., $\forall \pi_{y \to z} \in \Psi$, $\pi_{y \to z}(A(y)) = A(z)$.

• No : No assignment can satisfy more than a fraction $\frac{1}{|R_Y|^{\gamma}}$ of the constraints in $\Psi$. ■

As discussed in the introduction, a natural approach to build a hypergraph from the PCP $\Psi$
is to have a block of vertices for every variable y or z and define hyperedges of the hypergraph so
as to enforce the constraints $\pi_{y \to z}$. For every constraint $\pi_{y \to z}$, there will be hyperedges containing
vertices from the block of y and the block of z. However, this approach is limited by the fact that
the constraint graph underlying the PCP has a small vertex cover. Since each hyperedge contains
vertices from both the Y and Z 'sides', the subset of all vertices on the Y (resp. Z) 'side' already
covers all of the hyperedges regardless of whether the initial PCP system was satisfiable or not.²

This difficulty motivates our construction of a multilayered PCP where we have many types of
variables (rather than only Y and Z) and the resulting hypergraph is multipartite. The multilayered
PCP is able to maintain the properties of Theorem 3.1 between every pair of layers. Moreover,
the underlying constraint graph has a special 'weak-density' property that roughly guarantees it
will have only tiny independent sets (thus any vertex cover for it must contain almost all of the
vertices).

3.2 Layering the Variables 

Let $l, R > 0$. Let us begin by defining an l-layered PCP. In an l-layered PCP there are l sets of
variables denoted by $X_1, \ldots, X_l$. The range of variables in $X_i$ is denoted $R_i$, with $|R_i| = R^{O(l)}$.
For every $1 \le i < j \le l$ there is a set of constraints $\Phi_{ij}$ where each constraint $\pi \in \Phi_{ij}$ depends
on exactly one $x \in X_i$ and one $x' \in X_j$. For any two variables we denote by $\pi_{x \to x'}$ the constraint
between them if such a constraint exists. Moreover, the constraints in $\Phi_{ij}$ are projections from x to
x', that is, for every assignment to x there is exactly one assignment to x' such that the constraint
is satisfied.

In addition, as mentioned in the introduction, we would like to show a certain 'weak-density' 
property of our multilayered PCP: 

Definition 3.2 An l-layered PCP is said to be weakly-dense if for any $\delta > 0$, given $m \ge \lceil 2/\delta \rceil$ layers
$i_1 < \ldots < i_m$ and given any sets $S_j \subseteq X_{i_j}$ for $j \in [m]$ such that $|S_j| \ge \delta |X_{i_j}|$, there always exist
two sets $S_j$ and $S_{j'}$ such that the number of constraints between them is at least a $\frac{\delta^2}{4}$ fraction of
the constraints between the layers $X_{i_j}$ and $X_{i_{j'}}$.

2 Adding hyperedges entirely within vertices on the Y and Z sides cannot help either since we wish to ensure a
small vertex cover in the completeness case. Hence picking all vertices on, say, the Z side, together with the small
vertex cover that hits all edges entirely within the Y side (such a small cover must exist due to the completeness
case) will again give a vertex cover of weight close to 1/2.

Theorem 3.3 There exists a universal constant $\gamma > 0$, such that for any parameters $l, R$, there is
a weakly-dense l-layered PCP $\Phi = \bigcup \Phi_{ij}$ such that it is NP-hard to distinguish between the following two
cases:

• Yes : There exists an assignment that satisfies all the constraints.

• No : For every $i < j$, not more than $1/R^{\gamma}$ of the constraints in $\Phi_{ij}$ can be satisfied by an
assignment.

Proof: Let $\Psi$ be a constraint-system as in Theorem 3.1. We construct $\Phi = \bigcup \Phi_{ij}$ as follows. The
variables $X_i$ of layer $i \in [l]$ are the elements of the set $Z^i \times Y^{l-i}$, i.e., all l-tuples where the first i
elements are Z variables and the last $l - i$ elements are Y variables. The variables in layer i have
assignments from the set $R_i = (R_Z)^i \times (R_Y)^{l-i}$ corresponding to an assignment to each variable
of $\Psi$ in the l-tuple. It is easy to see that $|R_i| \le R^{O(l)}$ for any $i \in [l]$ and that the total number of
variables is no more than $|\Psi|^{O(l)}$. For any $1 \le i < j \le l$ we define the constraints in $\Phi_{ij}$ as follows.
A constraint exists between a variable $x_i \in X_i$ and a variable $x_j \in X_j$ if they contain the same
variables in the first i and the last $l - j$ elements of their l-tuples. Moreover, for any $i < k \le j$
there should be a constraint in $\Psi$ between $x_{i,k}$ and $x_{j,k}$. More formally, denoting $x_i = (x_{i,1}, \ldots, x_{i,l})$
for $x_i \in X_i = Z^i \times Y^{l-i}$,

$$\Phi_{ij} = \left\{ \pi_{x_i \to x_j} \;\middle|\; \begin{array}{l} x_i \in X_i,\ x_j \in X_j; \\ \forall k \in [l] \setminus \{i+1, \ldots, j\},\ x_{i,k} = x_{j,k}; \\ \forall k \in \{i+1, \ldots, j\},\ \pi_{x_{i,k} \to x_{j,k}} \in \Psi \end{array} \right\}$$

As promised, the constraints are projections. Given an assignment $a = (a_1, \ldots, a_l) \in R_i$
to $x_i$, we define the consistent assignment $b = (b_1, \ldots, b_l) \in R_j$ to $x_j$ as $b_k = \pi_{x_{i,k} \to x_{j,k}}(a_k)$ for
$k \in \{i+1, \ldots, j\}$ and $b_k = a_k$ for all other k.
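The layering can be sketched on a toy two-layer instance. The code below is an illustrative reading of the construction, not the authors' code: it uses 0-based tuple positions (so positions i..j-1 correspond to coordinates i+1..j), and the names `layer`, `constraints_between`, `project`, and the toy instance `psi` are assumptions.

```python
from itertools import product


def layer(i, l, Y, Z):
    """Variables of layer i: tuples with i Z-variables followed by l - i Y-variables."""
    return [zs + ys for zs in product(Z, repeat=i) for ys in product(Y, repeat=l - i)]


def constraints_between(i, j, l, Y, Z, psi):
    """Pairs (x_i, x_j), i < j, that agree outside positions i..j-1 and are linked in psi there."""
    pairs = []
    for xi in layer(i, l, Y, Z):
        for xj in layer(j, l, Y, Z):
            same_outside = all(xi[k] == xj[k] for k in range(l) if not i <= k < j)
            linked = all((xi[k], xj[k]) in psi for k in range(i, j))
            if same_outside and linked:
                pairs.append((xi, xj))
    return pairs


def project(a, xi, xj, i, j, psi):
    """The unique assignment to x_j consistent with assignment a to x_i."""
    return tuple(psi[(xi[k], xj[k])][a[k]] if i <= k < j else a[k] for k in range(len(a)))


# Toy two-layer instance (hypothetical): psi[(y, z)] is the projection table.
Y, Z, l = ("y1", "y2"), ("z1",), 3
psi = {("y1", "z1"): {0: 0, 1: 0, 2: 1, 3: 1}, ("y2", "z1"): {0: 1, 1: 0, 2: 1, 3: 0}}
A = {"y1": 2, "y2": 0, "z1": 1}  # satisfies both constraints of psi


def B(x):
    """Coordinate-wise assignment induced by A."""
    return tuple(A[v] for v in x)


all_satisfied = all(
    project(B(xi), xi, xj, i, j, psi) == B(xj)
    for i in range(1, l + 1)
    for j in range(i + 1, l + 1)
    for xi, xj in constraints_between(i, j, l, Y, Z, psi)
)
count_1_3 = len(constraints_between(1, 3, l, Y, Z, psi))
```

Because A satisfies every constraint of the toy instance, the coordinate-wise assignment B satisfies every layered constraint, which mirrors the completeness direction of the construction.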

The completeness of $\Phi$ follows easily from the completeness of $\Psi$. That is, assume we are given
an assignment $A : Y \cup Z \to R_Y \cup R_Z$ that satisfies all the constraints of $\Psi$. Then, the assignment
$B : \bigcup X_i \to \bigcup R_i$ defined by $B(x_1 \ldots x_l) = (A(x_1) \ldots A(x_l))$ is a satisfying assignment.

For the soundness part, assume that there exist two layers $i < j$ and an assignment B that
satisfies more than a $1/R^{\gamma}$ fraction of the constraints in $\Phi_{ij}$. We partition $X_i$ into classes such that
two variables in $X_i$ are in the same class iff they are identical except possibly on coordinate j. The
variables in $X_j$ are also partitioned according to coordinate j. Since more than $1/R^{\gamma}$ of the
constraints in $\Phi_{ij}$ are satisfied, it must be the case that there exist a class $x_{i,1}, \ldots, x_{i,j-1}, x_{i,j+1}, \ldots, x_{i,l}$
in the partition of $X_i$ and a class $x_{j,1}, \ldots, x_{j,j-1}, x_{j,j+1}, \ldots, x_{j,l}$ in the partition of $X_j$ between which
there exist constraints and the fraction of satisfied constraints is more than $1/R^{\gamma}$. We define an
assignment to $\Psi$ as

$$A(y) = (B(x_{i,1}, \ldots, x_{i,j-1}, y, x_{i,j+1}, \ldots, x_{i,l}))_j$$

for $y \in Y$ and as

$$A(z) = (B(x_{j,1}, \ldots, x_{j,j-1}, z, x_{j,j+1}, \ldots, x_{j,l}))_j$$

for $z \in Z$. Notice that there is a one-to-one and onto correspondence between the constraints in
$\Psi$ and the constraints between the two chosen classes in $\Phi$. Moreover, if the constraint in $\Phi$ is
satisfied, then the constraint in $\Psi$ is also satisfied. Therefore, A is an assignment to $\Psi$ that satisfies
more than $1/R^{\gamma}$ of the constraints.

To prove that this multilayered PCP is weakly-dense, we recall the bi-regularity property
mentioned above, i.e., each variable $y \in Y$ appears in the same number of constraints and also each
$z \in Z$ appears in the same number of constraints. Therefore, the distribution obtained by uniformly
choosing a variable $y \in Y$ and then uniformly choosing one of the variables $z \in Z$ with which it
has a constraint is a uniform distribution on Z.

Take any $m = \lceil 2/\delta \rceil$ layers $i_1 < \ldots < i_m$ and sets $S_j \subseteq X_{i_j}$ for $j \in [m]$ such that $|S_j| \ge \delta |X_{i_j}|$.
Consider a random walk beginning from a uniformly chosen variable $x_1 \in X_1$ and proceeding to a
variable $x_2 \in X_2$ chosen uniformly among the variables with which $x_1$ has a constraint. The random
walk continues in a similar way to a variable $x_3 \in X_3$ chosen uniformly among the variables with
which $x_2$ has a constraint and so on up to a variable in $X_l$. Denote by $E_j$ the indicator variable of
the event that the random walk hits an $S_j$ variable when in layer $X_{i_j}$. From the uniformity of $\Psi$ it
follows that for every j, $\Pr[E_j] \ge \delta$. Moreover, using the inclusion-exclusion principle, we get:

$$1 \ge \Pr\Big[\bigvee_j E_j\Big] \ge \sum_j \Pr[E_j] - \sum_{j<k} \Pr[E_j \wedge E_k] \ge 2 - \binom{m}{2} \max_{j<k} \Pr[E_j \wedge E_k]$$

which implies

$$\max_{j<k} \Pr[E_j \wedge E_k] \ge 1 \Big/ \binom{m}{2} \ge \frac{\delta^2}{4}.$$

Fix j and k such that $\Pr[E_j \wedge E_k] \ge \frac{\delta^2}{4}$ and consider a shorter random walk beginning from
a random variable in $X_{i_j}$ and proceeding to the next layer and so on until hitting layer $i_k$. Since
the random walk is uniform on $X_{i_j}$ we still have that $\Pr[E_j \wedge E_k] \ge \frac{\delta^2}{4}$ where the probability is taken over the
random walks between $X_{i_j}$ and $X_{i_k}$. Also, notice that there is a one-to-one and onto mapping from
the set of all random walks between $X_{i_j}$ and $X_{i_k}$ to the set $\Phi_{i_j, i_k}$. Therefore, at least a fraction $\frac{\delta^2}{4}$
of the constraints between $X_{i_j}$ and $X_{i_k}$ are between $S_j$ and $S_k$, which completes the proof of the
weak-density property. ■



4 The Hypergraph Construction 

Theorem 4.1 (Main Theorem) For any $k \ge 3$ it is NP-hard to approximate the vertex-cover
on a k-uniform hypergraph within any constant factor less than $k - 1$.

Proof: Fix $k \ge 3$ and arbitrarily small $\varepsilon > 0$. Define $p = 1 - \frac{1}{k-1-\varepsilon}$. Let $\Phi$ be a PCP instance
with layers $X_1, \ldots, X_l$, as described in Theorem 3.3 with parameters $l = 32\varepsilon^{-2}$ and R large enough
to be chosen later. We present a construction of a k-uniform hypergraph G = (V, E). We use the
Long Code introduced by Bellare et al. [3]. A Long Code over domain R has one bit for every
subset $v \subseteq R$. An encoding of element $x \in R$ assigns bit-value 1 to the sets v s.t. $x \in v$ and assigns
0 to the sets which do not contain x. In the following, the bits in the Long Code will be vertices
of the hypergraph. The vertices that correspond to a bit-value 0 are (supposedly) the vertices of a
Vertex Cover.

Vertices. For each variable x in layer $X_i$ we construct a block of vertices V[x]. This block
contains a vertex for each subset of $R_i$. Throughout this section we slightly abuse notation by
writing a vertex rather than the set it represents. The weight of the vertices inside the block V[x]
is according to $\mu_p^{R_i}$, i.e. the weight of a subset $v \subseteq R_i$ is proportional to $\mu_p^{R_i}(v) = p^{|v|}(1-p)^{|R_i \setminus v|}$
as in Definition 2.4. All blocks in the same layer have the same total weight and the total weight
of each layer is $\frac{1}{l}$. Formally, the weight of a vertex $v \in V[x]$ where $x \in X_i$ is given by
$\frac{1}{l |X_i|} \mu_p^{R_i}(v)$.

Hyperedges. We construct hyperedges between blocks V[x] and V[y] such that there exists a
constraint $\pi_{x \to y}$. We connect a hyperedge between any $v_1, \ldots, v_{k-1} \in V[x]$ and $u \in V[y]$ whenever
$\pi_{x \to y}\big(\bigcap_{i=1}^{k-1} v_i\big) \cap u = \emptyset$.

Let IS(G) denote the weight of vertices contained in the largest independent set of the
hypergraph G.

Lemma 4.2 (Completeness) If $\Phi$ is satisfiable then $IS(G) \ge p$.

Proof: Let A be a satisfying assignment for $\Phi$, i.e., A maps each $i \in [l]$ and $x \in X_i$ to an
assignment in $R_i$ such that all the constraints are satisfied. Let $\mathcal{I} \subseteq V$ contain in the block V[x]
all the vertices that contain the assignment A(x),

$$\mathcal{I} = \bigcup_x \{ v \in V[x] \mid v \ni A(x) \}.$$

We claim that $\mathcal{I}$ is an independent set. Take any $v_1, \ldots, v_{k-1}$ in $\mathcal{I} \cap V[x]$ and a vertex u in $\mathcal{I} \cap V[y]$.
The vertices $v_1, \ldots, v_{k-1}$ intersect on A(x) and therefore the projection of their intersection contains
$\pi_{x \to y}(A(x)) = A(y)$. Since u is in $\mathcal{I} \cap V[y]$ it must contain A(y). The proof is completed by noting
that inside each block, the weight of the set of all vertices that contain a specific assignment is
exactly p. ■
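The completeness argument can be checked by brute force on a single constraint with tiny ranges. The sketch below assumes k = 3, a hypothetical projection `pi`, and helper names (`powerset`, `mu`, `is_hyperedge`) introduced only for this illustration; it also allows repeated vertices among the k - 1 chosen from a block, which only makes the check stricter.

```python
from itertools import combinations, combinations_with_replacement


def powerset(ground):
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def mu(p, ground, v):
    """p-biased weight of the vertex (subset) v inside its block."""
    return p ** len(v) * (1 - p) ** (len(ground) - len(v))


def is_hyperedge(vs, u, pi):
    """Hyperedge rule: the projected common intersection of vs misses u."""
    common = frozenset.intersection(*vs)
    return not ({pi[a] for a in common} & u)


k, p = 3, 0.4
R_x, R_y = {0, 1, 2}, {0, 1}
pi = {0: 0, 1: 0, 2: 1}           # hypothetical projection from R_x to R_y
a_x = 2
a_y = pi[a_x]                     # a satisfying assignment to the pair (x, y)

I_x = [v for v in powerset(R_x) if a_x in v]
I_y = [u for u in powerset(R_y) if a_y in u]

edge_inside = any(
    is_hyperedge(vs, u, pi)
    for vs in combinations_with_replacement(I_x, k - 1)
    for u in I_y
)
weight_x = sum(mu(p, R_x, v) for v in I_x)
```

No hyperedge lies inside the chosen set, and its relative weight inside the block of x is p, matching the two facts used in the proof.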

We now turn to the soundness of the construction. 
Lemma 4.3 (Soundness) If $IS(G) \ge \varepsilon$ then $\Phi$ is satisfiable.

This lemma completes the proof of our main result since the ratio between the sizes of the vertex
cover in the yes and no cases is $\frac{1-\varepsilon}{1-p} = (1 - \varepsilon)(k - 1 - \varepsilon)$ which can be arbitrarily close to $k - 1$.
Proof: Let $\mathcal{I}$ be an independent set of weight $\varepsilon$. We consider the set X' of all variables x for
which the weight of $\mathcal{I} \cap V[x]$ in V[x] is at least $\varepsilon/2$. A simple averaging argument shows that the
weight of $\bigcup_{x \in X'} V[x]$ is at least $\frac{\varepsilon}{2}$. Another averaging argument shows that in at least $\frac{\varepsilon l}{4} = \frac{8}{\varepsilon}$
layers, X' contains at least $\frac{\varepsilon}{4}$ fraction of the variables. Using the weak-density property of the
PCP (see Definition 3.2), we conclude that there exist two layers $X_i$ and $X_j$ such that $\frac{\varepsilon^2}{64}$ fraction
of the constraints between them are constraints between variables in X'. Let us denote by X the
variables in $X_i \cap X'$ and by Y the variables in $X_j \cap X'$.
For any variable $x \in X$ consider the vertices in $\mathcal{I} \cap V[x]$. According to Lemma 2.5 there exists
a $t = t(\frac{\varepsilon}{2}, k - 1, p)$ and $k - 1$ vertices in $\mathcal{I} \cap V[x]$ that intersect in less than t assignments. We
denote these vertices by $v_{x,1}, \ldots, v_{x,k-1}$ and their intersection by B(x).

In the following we define an assignment to the variables in X and Y such that many of the
constraints between them are satisfied. Then Theorem 3.3 would imply that $\Phi$ must be satisfiable
(provided R is chosen large enough). For a variable $x \in X$ we choose a random assignment from
the set B(x). For a variable $y \in Y$ we choose the assignment

$$A(y) = \arg\max_{a \in R_Y} |\{x \in X \mid a \in \pi_{x \to y}(B(x))\}|,$$

i.e., the assignment that is contained in the largest number of projections of B(x).
Before continuing, we need the following simple claim: 

Claim 4.4 Let $A_1, \ldots, A_n$ be a collection of n sets of size at most m such that no element is
contained in more than k sets. Then, there are at least $\frac{n}{km}$ disjoint sets in this collection.

Proof: We prove by induction on n that there are at least $\frac{n}{1+(k-1)m}$ disjoint sets in the collection.
The claim holds trivially for $n \le 1 + (k - 1)m$. Otherwise, consider all the sets that intersect $A_1$.
Since no element is contained in more than k sets, the number of such sets (including $A_1$) is at
most $1 + (k - 1)m$. Removing these sets we get, by using the induction hypothesis, a collection
that contains $\frac{n - (1+(k-1)m)}{1+(k-1)m} = \frac{n}{1+(k-1)m} - 1$ disjoint sets. We conclude the induction step by adding
$A_1$ to the disjoint sets. ■
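The induction in the proof corresponds to a simple greedy procedure: keep a set, discard everything that meets it, and repeat. A minimal sketch follows; the function name `greedy_disjoint` and the example collection are assumptions.

```python
import math
from typing import FrozenSet, List


def greedy_disjoint(sets: List[FrozenSet[int]]) -> List[FrozenSet[int]]:
    """Repeatedly keep the first remaining set and drop all sets intersecting it."""
    remaining = list(sets)
    chosen: List[FrozenSet[int]] = []
    while remaining:
        a = remaining.pop(0)
        chosen.append(a)
        remaining = [b for b in remaining if not (a & b)]
    return chosen


# Six sets of size m = 2 arranged in a cycle; every element lies in k = 2 sets.
sets = [frozenset(s) for s in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]]
n, m, k = len(sets), 2, 2
chosen = greedy_disjoint(sets)
bound = math.ceil(n / (1 + (k - 1) * m))
```

Each round removes at most $1 + (k-1)m$ sets, so the procedure keeps at least $n / (1 + (k-1)m)$ of them. On the cycle example it keeps three pairwise disjoint sets, above the guaranteed two.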

Consider a variable $y \in Y$ and a variable $x \in X$ such that the constraint $\pi_{x \to y}$ exists. There are no
hyperedges of the form $(v_{x,1}, \ldots, v_{x,k-1}, u)$ for any vertex $u \in \mathcal{I} \cap V[y]$. Therefore, every vertex
$u \in \mathcal{I} \cap V[y]$ must intersect $\pi_{x \to y}(B(x))$. Now consider the family of projections $\pi_{x \to y}(B(x))$ for all
the variables $x \in X$ such that the constraint $\pi_{x \to y}$ exists. Let q denote the maximum number of disjoint
sets inside this family. Note that every disjoint set reduces the weight of the vertices in $\mathcal{I} \cap V[y]$
by a factor of $1 - (1 - p)^t$. Because the weight of $\mathcal{I} \cap V[y]$ is at least $\frac{\varepsilon}{2}$, we obtain that q is at most
$\log(\frac{\varepsilon}{2}) / \log(1 - (1 - p)^t)$. Claim 4.4 implies that there exists an assignment for y that is contained
in at least a fraction

$$\frac{1}{t \log(\frac{\varepsilon}{2}) / \log(1 - (1 - p)^t)}$$

of the projections $\pi_{x \to y}(B(x))$. Therefore, the expected fraction of constraints satisfied between X
and Y is at least

$$\frac{1}{t^2 \log(\frac{\varepsilon}{2}) / \log(1 - (1 - p)^t)}$$

which is a constant that does not depend on R. We complete the proof by choosing the range R
of the PCP large enough so that this fraction is larger than $1/R^{\gamma}$ and applying Theorem 3.3. This
completes the soundness proof. ■